Abstract. We introduce a hierarchical notion of formal proof, useful
in the implementation of theorem provers, which we call hiproofs. Two
alternative definitions are given, motivated by existing notations used in 
theorem proving research. We define transformations between these two 
forms of hiproof, develop notions of underlying proof, and give a suitable 
definition of refinement in order to model incremental proof development. 
We show that our transformations preserve both underlying proofs and 
refinement. The relationship of our theory to existing theorem proving 
systems is discussed, as is its future extension. 
1 Introduction

The activity of tactical theorem proving is, inherently, a hierarchically organised 
one. Its hierarchical structure is inherent both in the way composite tactics 
are defined to consist of other tactics, and in the way that tactics are then 
sequentially applied to successions of goals. 

Accordingly, a number of proof assistants offer direct support for some no- 
tion of hierarchy in their data structure and language of tactics; see, for example, 
[CS00,RSG98,KNM94]. What has been lacking, however, is a sufficiently conceptual
and abstract characterisation of hierarchy in mechanised proofs which, by
being independent of implementation details, would in the long run underpin a 
larger theory to support the art of tool building. In particular, we believe this 
to be useful for developing interfaces: both graphical interfaces for individual 
theorem provers, and interfaces between theorem provers. 

A first approach at such a characterisation, which is both mathematically 
precise and lends itself to diagrammatic visualisation, is the aim of this paper. 

Consider, for example, how Fig. 1(a) could be understood as depicting a 
proof by induction. In this representation we emphasise the tactics rather than the
original goal and subsequent proof states, and so the nodes in the diagram are 
labelled by tactic identifiers. Inclusion of one node in another indicates a subtac- 
tic relationship, and the arrow indicates sequential composition: the tactic at the 
source of the arrow is invoked first, then the tactic at the target. The diagram
is then read as saying that, at the most abstract level, the proof is obtained by 
invoking an induction tactic, Induction. This consists of applying an induction 
rule, Ind-Rule, which then generates two subgoals. The first subgoal is handled 
by the Base tactic, the second by the Step tactic. In turn, Step is defined as 
first applying the Rewrite tactic, and then the Use-Hyp tactic. For the purposes 
of this example, we regard Base, Rewrite and Use-Hyp as primitive.



Fig. 1. Two Hierarchical Proofs: panels (a) and (b).


A slightly more complex example is given in Fig. 1(b). At the most abstract 
level, the proof consists of applying T1, and then DP. The tactic T1 first applies
T2, generating two subgoals, the first of which is handled by WF. The second is 
handled by DP, which applies Normalise and then Taut. 

We will refer to these hierarchical representations of proofs as hiproofs. Obvi- 
ous questions to ask are: what are the acceptable well-formed hiproofs? what are 
their mathematical properties? what operations can we perform on these struc- 
tures? what is the relationship between such a hiproof and the underlying logic? 
Indeed, such diagrams are not the only possible hierarchical representations for 
proofs, and we should ask if there are any substantial differences between the dif- 
ferent approaches. We will address most of these questions in this paper, giving 
formal definitions for two different notions of hiproof, and proving an equivalence 
theorem. 




Before doing so, however, it is instructive to take one closer look at the 
relationship between hiproofs, tactics and standard notions of formal proof, such 
as proofs in natural deduction style. 


Example 1. Consider a natural deduction proof of $A \Rightarrow A \wedge (x = x)$, as in
Fig. 2. The obvious (backwards) proof is implication introduction, followed by
conjunction introduction, which yields two subgoals. The essential information
of the proof is the sequence of inference rules which are applied, and we can
represent the order of these rules as a tree, as in Fig. 3(a). Typically, however,
theorem provers allow the use of higher-level tactics which group together the
application of a number of low-level inferences. For example, it is common to
have an Intros command, which performs all possible introduction rules. We can
indicate this on the proof diagram by grouping Imp-I and And-I together, as in
Fig. 3(b). We could go further and define a tactic, Prop, which first calls Intros,
and then tries to use axioms wherever possible. This gives the hierarchical
structure shown in Fig. 3(c). □

Fig. 2. A simple natural deduction proof.

Fig. 3. Introducing hierarchy in proof diagrams by grouping.







Compared to the hierarchical structures which are implemented in state- 
of-the-art theorem provers, our hiproofs abstract away from some of the more 
concrete and operational aspects of those structures. These key abstractions, 
made necessary so as to arrive at a unifying and implementation-independent 
notion, are the following: 

- In hiproofs, we only model tactics, not goals. This is firstly because a notion
of hierarchy, per se, should ideally be independent of the underlying logic.
Secondly, tactics themselves possess a rich structure, and it is our intention 
to find the simplest possible framework in which we can study this. This 
framework may then be used to assist the development of complex tactics in 
a top-down, hierarchical fashion. 

- In our framework, a tactic is thought of as a black box, and so none of its 
implementation details are listed, except to record which other tactics it may 
be defined in terms of. In general, we also regard inference rules and axioms 
as primitive tactics. 

- Hiproofs model static structure, not dynamics. Thus our diagrams represent 
only the sequence of tactic applications which led to a successful proof, and 
do not indicate the tactic definitions themselves or give any information 
about proof search. 

- Hiproofs are structures on proofs, not on particular implementations of 
proofs. Thus, we have begun by regarding the underlying proofs to be trees, 
as is standard in the study of formal logic, whereas some systems implement 
proofs as dags rather than trees, and some allow and/or trees to model proof 
search. Nothing in our treatment is specific to the particular notion of basic 
proof we have taken, let alone specific to any implementation technology for 
proofs. 

We now proceed to explain how the key features of hiproofs can be
intuitively understood in terms of the features of the proof diagrams in Figs. 1(a) 
and 1(b). (Although we will be motivated by the diagrams, we make a decision 
to take non-graphical definitions as basic, and so, abstract away from various 
graphical representations to the underlying mathematical structure.) To this 
end, we make the following observations: 

- First, there is no requirement for uniqueness of tactic identifiers, since a tactic 
can certainly be applied repeatedly in a proof. However, we will informally 
refer to proof nodes by their tactic identifier where there is no ambiguity. 

- There are only two relationships which can hold between nodes. In the dia- 
grams, inclusion indicates the unfolding of a tactic into its definition, while 
the arrows indicate sequential composition. For example, in Fig. 1(b), the
decision procedure, DP, unfolds to give the composition of Normalise with 
Taut. 

- Like Figs. 1(a) and 1(b), hiproofs are essentially tree-like, that is, subgoals
are independent: a tactic acts on a single subgoal. (Of course, in general,
this is not the case, and such extensions will be the subject of future work.)
Normally, tactics are thought of as returning a list of subgoals. Here, though,
we abstract away from the order and do not place any order on child tactics.

A hiproof, therefore, consists of a finite collection of tactic-labelled nodes, 
related by inclusion and composition. Although the diagrams represent abstract 
versions of full proofs, we will be interested in how such proofs are built up, so 
we also consider, intuitively, partial proofs to be well-formed.

Related Work. The use of diagrams in logic is not new, and is discussed, in
general, by [BH96]. An important point, here, is that hierarchical structure can 
be put on a basic proof in different ways. Two hiproofs can be quite different, yet 
have the same underlying proof. This is in contrast to Fitch-style boxed natural 
deduction proofs, where there is only one way to draw the boxes on a given 
proof. 

Hierarchical proofs, such as we consider here, are popular with the proof plan- 
ning [Bun96] community. For example, [RB99] uses two different representations. 
There has not been much analysis of the algebraic features of proofs, however. 
An exception is [RS01], which concentrates on the dynamics of a particular rep- 
resentation language. Our concern, here, is with static structural properties. 

Bearing the above general differences in mind, we now take a look at how 
hiproofs generalise the structures found in some of the most popular systems. 
In general, any system (such as Coq, HOL and PVS) in which tactics may be 
defined from other tactics leads naturally to the kind of hierarchical proof which 
is central to our theory. 

More specifically, the notion of hierarchy which we have formalised here is 
similar to that underlying the Lambda Clam family of proof planners [RSG98]. 
Yet it is more general, since Lambda Clam does not (currently) allow tactics to 
leave open goals, as we do in Fig. 1(b), for example. The Tecton system [KNM94] 
also supports hierarchical proof, though its hierarchy is only ‘two level’ and not
arbitrarily nested like here. The PDS data structure [CS00], on the other hand,
implemented in Omega [BCF+97] and other systems, is of similar generality 
to our hiproofs, but is less abstract. A PDS consists of names, sequents, and
elements called justifications and reasons. Of those only named nodes and
justification elements have counterparts in hiproofs. Sequents and reasons implement
goals and back-tracking, and so have no counterparts in hiproofs. Otherwise, a 
PDS (one which, moreover, happens to be a tree at the lowest level) corresponds 
very closely to the second variant of hiproof we develop. 

The paper is organised as follows. The following section develops two def- 
initions of hiproof, assuming the reader to be broadly familiar with trees and 
forests. (The necessary graph and order theory is summarised in an appendix 
along with some detailed technical proofs.) Section 3 defines transformations 
between our two notions of hiproof, and shows their equivalence. We also in- 
troduce a notion of underlying proof which this equivalence respects. Section 4 
defines refinement of hiproofs and proves that the transformations are refine- 
ment preserving. Finally, Section 5 concludes and discusses how this theory can 
be developed further. 
Fig. 4. Four non-proofs


2 Two definitions of hiproof structure 

In this section two definitions of hiproof structure are presented, each corre- 
sponding to a different kind of existing graphical representation. In the next 
section, they will be shown to be equivalent to each other. To enhance the con- 
ceptual clarity of our definitions we will conveniently regard certain trees and 
forests (i.e. sets of trees) as partially ordered sets (posets) and vice versa. The 
formal details on which this interchange depends are, for reasons of space and 
conciseness of exposition, included in an appendix. 

First, we fix a non-empty set A of tactic identifiers (or method identifiers). 
Based on the intuition for tactics outlined in the previous section, we now give 
an analysis of why the diagrams of Fig. 4 are not examples of valid hiproofs. 

Operationally, one tactic is followed by another, which unfolds to give another 
tactic, and so on. Thus we might say that tactics are invoked ‘at the most abstract
level’. The first diagram is invalid under this interpretation, then, because if T1
is followed by T3 and T2 unfolds to T3 then the more abstract T2 should follow
T1. Another way of putting it is that it would be permissible for T3 to follow T1,
but then the fact that T2 is an abstraction of T3 is irrelevant to this proof and
should not be added after the composition of T1 and T3.

Conversely, when a tactic finishes executing, control flows from the most 
recently executed tactic, i.e. the innermost, outwards, but the second diagram 
does not follow this convention. 

The third diagram is also invalid: either T1 unfolds to T2 or T2 follows
sequentially, but not both.

Finally, the fourth example fails as well, for the reason that tactic T1 must
unfold to give a unique subsequent tactic to execute, not two. To make this
precise we say that T2 and T3 are siblings under T1, a notion which we generalise
from trees to forests (i.e. sets of trees):

Definition 1. Let F be a forest, considered equivalently as a graph $(V, \to)$ or
set $\{T_1, \ldots, T_n\}$ of trees. The predicate $\mathit{isroot}_F(v)$ (also written
$\mathit{isroot}_{\to}(v)$ when the context of $\to$ is understood to be F) is defined to hold
if and only if there is a tree $T_j = (V_j, \to_j, r_j)$ in F such that $v = r_j$; or,
equivalently, when no $v_0$ exists such that $v_0 \to v$. Further, we say $v$ and $v'$ are
siblings in F, and write $\mathit{siblings}_F(v, v')$ (or $\mathit{siblings}_{\to}(v, v')$ when the
context of $\to$ is understood to be F), if $v$ and $v'$ have the same parent
(i.e. $\exists v''.\ v'' \to v$ and $v'' \to v'$), or are both roots. □

Notice, in particular, that every node v is trivially a sibling of itself. 

Our first definition of hiproof, in keeping
with pictorial intuition, emphasises the reflexive, antisymmetric and transitive
nature of spatial inclusion as a partial order, while using a graph structure for 
sequential composition: 

Definition 2. A hiproof of type 1 consists of a forest qua poset $i = (V, \leq_i)$ and
a forest $s = (V, \to_s)$, together with a function $t : V \to A$ which labels the nodes
in V with tactic identifiers in A. (The relations $\leq_i$ and $\to_s$ correspond to the
inclusion and sequentiality relations among tactics.) These data are subject to
the following additional conditions:

1. Inclusion and sequence are mutually exclusive: whenever $v \leq_i w$ and
($v \to_s^* w$ or $w \to_s^* v$) then $v = w$.

2. Arrows always target outer nodes: whenever $v \to_s w_1$ and $w_1 <_i w_2$ then
$v <_i w_2$ (where $<_i$ is the strict part of $\leq_i$); and

3. Arrows always emanate from inner nodes: whenever $w_1 \leq_i v$ and $v \to_s w_2$
then $v = w_1$; and

4. Given any two nodes $v$ and $v'$ which both lie at the top inclusion level, or
are both immediately included in the same node, then at most one of $v$, $v'$
has no incoming $\to_s$ edge:

$\forall v, v' \in V.\ \mathit{siblings}_i(v, v') \wedge \mathit{isroot}_s(v) \wedge \mathit{isroot}_s(v') \Rightarrow v = v'$ □

The first condition, equivalently written
$v \leq_i w \wedge v \neq w \Rightarrow v \not\to_s^* w \wedge w \not\to_s^* v$,
precludes non-proof examples such as that of Fig. 4(c). On the other hand, the
fourth condition is designed to preclude the situation of Fig. 4(d), as well as the
similar non-proof example obtained from Fig. 4(d) after removing the enclosing
node labelled T1. The second condition prohibits that the inclusion hierarchy is
‘downwards’ transcended by sequentiality (e.g. as in Fig. 4(a)), but still permits 
the situation of Fig. 1(b), where the arrow from the node labelled T2 to the one 
labelled DP ‘upwards’ transcends the inclusion hierarchy. A further interesting 
consequence, proved in the appendix, is the following: 

Lemma 1. Let $(V, \leq_i, \to_s, t)$ be a hiproof of type 1. Then: (1) There is a unique
node $v_0 \in V$ which is both maximal wrt. $\leq_i$ and has no incoming edge; i.e.
there is a unique $v_0$ such that $\mathit{isroot}_{\leq_i}(v_0)$ and $\mathit{isroot}_{\to_s}(v_0)$; and
(2) $v \to_s w_1$ and $w_1 \in \mathrm{cover}_{\leq_i}(w_2) \Rightarrow v \in \mathrm{cover}_{\leq_i}(w_2)$ is equivalent to
condition 2 in Def. 2. □

Let $\mathit{Hiproof}_1$ denote the set of hiproof structures of type 1 arising from
Definition 2 above. 


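Example 2. The example from Fig. 1(b) forms a hiproof of type 1 like so:

[Diagram: the inclusion forest, in which T1 includes T2 and WF, and DP includes
Normalise and Taut; and the sequential composition forest, with edges from T2
to WF and to DP, and from Normalise to Taut.]

where we distinguish between the inclusion and sequential composition
forests. □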
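
The four conditions of Def. 2 lend themselves to a mechanical check. The following Python sketch is an illustration only and not part of the formal development. It assumes that each forest is stored as a child-to-parent dictionary, that nodes are named by their tactic identifiers (which happen to be unique in Fig. 1(b)), and that condition 2 is read with strict inclusion. The names incl, seq, ancestors and is_type1_hiproof are hypothetical.

```python
from itertools import product

# Fig. 1(b), with nodes named by tactic identifier (unique in this proof).
# incl[v]: the node immediately including v, or None at the top level.
# seq[v]:  the source of the arrow into v, or None if v has no incoming arrow.
incl = {"T1": None, "DP": None, "T2": "T1", "WF": "T1",
        "Normalise": "DP", "Taut": "DP"}
seq = {"T1": None, "T2": None, "Normalise": None,
       "WF": "T2", "DP": "T2", "Taut": "Normalise"}


def ancestors(parent, v):
    """Strict ancestors of v in a forest given as a child -> parent map."""
    found = []
    while parent[v] is not None:
        v = parent[v]
        found.append(v)
    return found


def is_type1_hiproof(incl, seq):
    """Check conditions 1-4 of Def. 2 (both maps must share the same keys)."""
    nodes = list(incl)
    arrows = [(seq[w], w) for w in nodes if seq[w] is not None]

    def included(v, w):   # v <=_i w
        return v == w or w in ancestors(incl, v)

    def reaches(v, w):    # v ->_s^* w
        return v == w or v in ancestors(seq, w)

    # 1. Inclusion and sequence are mutually exclusive.
    c1 = all(v == w or not included(v, w)
             or not (reaches(v, w) or reaches(w, v))
             for v, w in product(nodes, repeat=2))
    # 2. Arrows target outer nodes: whatever strictly includes the target
    #    also strictly includes the source.
    c2 = all(w2 in ancestors(incl, v)
             for v, w1 in arrows for w2 in ancestors(incl, w1))
    # 3. Arrows emanate from inner nodes: an arrow source includes nothing.
    c3 = all(incl[w] != v for v, _ in arrows for w in nodes)
    # 4. Among inclusion siblings, at most one node has no incoming arrow.
    c4 = all(v == w or incl[v] != incl[w]
             or seq[v] is not None or seq[w] is not None
             for v, w in product(nodes, repeat=2))
    return c1 and c2 and c3 and c4
```

Under these assumptions the encoding of Fig. 1(b) passes all four checks, whereas an encoding of the situation described for Fig. 4(d), with T2 and T3 both included in T1 and no arrow between them, is rejected by condition 4.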

Proposition 1. No ‘composite cycles’ exist in a hiproof of type 1: writing
$v \lessdot_i v'$ whenever $v \in \mathrm{cover}_{\leq_i}(v')$, then for all $v, v' \in V$, whenever
$v\,(\gtrdot_i \cup \to_s)^+\,v'$ one has $v \neq v'$ (here, $R^+$ denotes the transitive closure
of a binary relation $R$).

Proof. (Sketch) $v\,(\gtrdot_i \cup \to_s)^+\,v'$ means there exists a non-empty sequence
$v_0 = v, v_1, \ldots, v_n = v'$ such that $v_i \gtrdot_i v_{i+1}$ or $v_i \to_s v_{i+1}$. The proof is by an
easy induction on the length $n$ of the sequence: in the base case $n = 1$,
$v = v_0 \neq v_1 = v'$ trivially follows from the inequality $v \gtrdot_i v'$, and also in the case
of $v \to_s v'$ as $(V, \to_s)$ is a forest (and therefore contains no $\to_s$ cycles). The
inductive hypothesis yields $v_0 \neq v_{n-1}$, in addition to which $v_{n-1} \neq v_n$ is easily
obtained by an argument similar to that of the base case. □

The definition of hiproofs of type 1 consists of two posets. Because of Prop. 1, 
it is possible to consolidate this in one single tree. As a first attempt at such a 
definition, we might try to represent Fig. 1(b) like so: 

        T1
        :
        T2
       /  \
     WF    DP
           :
       Normalise
           |
         Taut

A similar notation is used in [RB99]. The solid lines denote composition and the 
dashed lines inclusion. The problem with this representation is that it does not 
distinguish the similar proof where DP is actually a subtactic of T1. This can
be made explicit by pairing tactics with their level in the inclusion hierarchy;
hence, T1 and DP have level 0, T2 has level 1, and so on. Then we do not need
to distinguish two kinds of arrow, since this information is implicit from the 
respective levels of adjacent nodes. 

Definition 3. A hiproof of type 2 is a tree $(V, \to, r)$ together with two functions:
$t : V \to A$, which labels nodes with tactic identifiers, and $l : V \to \mathbb{N}$, which
assigns to each node a natural number, called its inclusion level. (Informally,
however, we shall often identify a node $v$ with a pair $(\lambda, l)$, thereby implicitly
asserting that $t(v) = \lambda$ and $l(v) = l$.) These data are subject to the following
conditions:

1. $l(r) = 0$, i.e. the root of the tree lies at inclusion level 0;

2. whenever $v \to v'$ one has $l(v') \leq l(v) + 1$; and

3. whenever $v \to v_1$, $v \to v_2$ and, moreover, $l(v_1) = l(v) + 1$, then $v_1 = v_2$. □

The second condition states that nodes are only (directly) connected to those 
nodes which they directly include or are composed with. In the latter case, the 
node can ‘escape’ to an arbitrarily lower inclusion level. Let $\mathit{Hiproof}_2$ denote
the set of hiproof structures of type 2, arising from Definition 3 above.

Example 3. The example from Fig. 1(b) forms a hiproof of type 2 like so:

          (T1,0)
             |
          (T2,1)
          /     \
     (WF,1)    (DP,0)
                  |
            (Normalise,1)
                  |
              (Taut,1)

where nodes are informally represented by their tactic identifiers and inclusion
levels. □
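
The conditions of Def. 3 can be checked in the same spirit. The sketch below is illustrative only: it assumes the tree is given as an acyclic child-to-parent dictionary with a separate dictionary of inclusion levels, and again names nodes by their (here unique) tactic identifiers. The names parent, level and is_type2_hiproof are hypothetical.

```python
# Example 3 as a child -> parent map plus inclusion levels. Nodes are named
# by tactic identifier, which happens to be unique in this proof.
parent = {"T1": None, "T2": "T1", "WF": "T2", "DP": "T2",
          "Normalise": "DP", "Taut": "Normalise"}
level = {"T1": 0, "T2": 1, "WF": 1, "DP": 0, "Normalise": 1, "Taut": 1}


def is_type2_hiproof(parent, level):
    """Check conditions 1-3 of Def. 3; `parent` is assumed to be acyclic."""
    roots = [v for v in parent if parent[v] is None]
    # 1. There is a single tree, and its root lies at inclusion level 0.
    if len(roots) != 1 or level[roots[0]] != 0:
        return False
    for p in parent:
        children = [v for v in parent if parent[v] == p]
        # 2. A child lies at most one inclusion level deeper than its parent.
        if any(level[c] > level[p] + 1 for c in children):
            return False
        # 3. A child exactly one level deeper must be the only child.
        if len(children) > 1 and any(level[c] == level[p] + 1 for c in children):
            return False
    return True
```

Changing the levels of DP, Normalise and Taut to 1, 2 and 2 gives the similar proof in which DP is a subtactic of T1. That variant also satisfies the three conditions, which illustrates how the levels carry the information that the undecorated tree could not.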

In Def. 3, both sequentiality and inclusion depth are implicit in the structure 
of the nodes and thus, in terms of their cognitive properties, the diagrams arising 
from hiproofs of type 2 may be less suited to human users than the diagrams 
arising from hiproofs of type 1. In the latter, two distinct visual relations (spatial
containment and edge connectivity) are used to represent tactic inclusion
and sequentiality (recall Fig. 1(b)), thus eliminating any potential for confusion. 
Owing to their economy, however, type 2 hiproofs may have distinct advantages 
as internal, machine-oriented representations. 

3 Hiproof mappings and equivalence 


Contrasting Fig. 1(b) and Example 3 seems to suggest that any given proof 
may be ‘equivalently’ represented in either type of hiproof. Indeed, we show 
here that the two definitions of hiproof introduced in the previous section can 
be related by two mutually inverse functions between the two sets of hiproofs.
Since $\mathit{Hiproof}_1$ and $\mathit{Hiproof}_2$ are countable sets, this is not very interesting in
itself, unless the mappings effected by the two functions in question are intuitive, 
meaningful and practically useful. This is indeed so in our case, where the two 
functions we consider intuitively map a hiproof of one type to the alternative 
but equivalent representation of the same underlying proof as a hiproof of the 
other type; and vice-versa. 



Definition 4. Define $\mu_{12} : \mathit{Hiproof}_1 \to \mathit{Hiproof}_2$ as the function sending
each hiproof $(V, \leq_i, \to_s, t)$ of type 1 (Def. 2) to the hiproof $(V, \to, r, t, l)$ of type
2 given by the following data:

- $l(v)$ is defined to equal 0 whenever $\mathit{isroot}_{\leq_i}(v)$ (i.e. $v$ is a root of the forest
qua poset $(V, \leq_i)$) and, recursively, to equal $l(\mathit{parent}_{\leq_i}(v)) + 1$ otherwise
(explicitly, in forest qua poset notation, for each $v'$, $\mathit{parent}_{\leq_i}(v')$ is the
unique $v$ such that $v' \in \mathrm{cover}_{\leq_i}(v)$);

- $v \to v'$ whenever $v \to_s v'$ or, $v' \in \mathrm{cover}_{\leq_i}(v)$ and $\mathit{isroot}_{\to_s}(v')$; and

- $r$ is the unique node asserted by Lemma 1. □

Definition 5. Define $\mu_{21} : \mathit{Hiproof}_2 \to \mathit{Hiproof}_1$ as the function sending
each hiproof $(V, \to, r, t, l)$ of type 2 (Def. 3) to the hiproof $(V, \leq_i, \to_s, t)$ of type
1 given by the following data:

- $v \to_s v'$ whenever $v \to v'$ and $l(v') \leq l(v)$;

- $\leq_i$ is the reflexive and transitive closure of $<_i^1$, the latter being defined thus:
$v <_i^1 v'$ whenever a (non-empty) path $v' = v_0 \to v_1 \to \cdots \to v_n \to v_{n+1} = v$ exists
such that $l(v_1) = l(v_0) + 1$ and $l(v_i) = l(v_{i+1})$ for $1 \leq i \leq n$. □

Having established that both $\mu_{12}$ and $\mu_{21}$ are well-defined (Propositions 5 and 6
in the Appendix), we now proceed to show also that they are mutually inverse.

Theorem 1. The functions $\mu_{12} : \mathit{Hiproof}_1 \to \mathit{Hiproof}_2$ and
$\mu_{21} : \mathit{Hiproof}_2 \to \mathit{Hiproof}_1$ are mutually inverse.

Proof. We only sketch one direction of the argument, as the other is similar.
Let $h_1 = (V, \leq_i, \to_s, t)$, $h_2 = \mu_{12}(h_1) = (V_2, \to, r, t_2, l)$ and
$h_1' = \mu_{21}(h_2) = (V', \leq_i', \to_s', t')$. From the definitions, it immediately follows that
$V = V' = V_2$ and $t = t_2 = t'$. Here we present only the proof of $\leq_i' \subseteq \leq_i$ (required
to show $\leq_i' = \leq_i$) as a representative case.

Assume $v \leq_i' v'$ and proceed by induction on the length of the path linking $v'$
to $v$ in the forest $(V, \leq_i')$. The base case is trivial. For the inductive case, assume
$v \leq_i v''$ by the induction hypothesis and $v'' \in \mathrm{cover}_{\leq_i'}(v')$. From the latter and
Def. 5, a path $v' = v_0 \to \cdots \to v_n \to v_{n+1} = v''$ must exist in $h_2$ such that
$l(v_1) = l(v_0) + 1$ and $l(v_{i+1}) = l(v_i)$ for $1 \leq i \leq n$. It follows from Def. 4
that $v' = v_0 = \mathit{parent}_{\leq_i}(v_1)$ or, equivalently, $v_1 \in \mathrm{cover}_{\leq_i}(v')$. From $v_i \to v_{i+1}$
and $l(v_i) = l(v_{i+1})$, in turn, it follows that $v_i$ and $v_{i+1}$, for $i$ ranging as above,
also share $v' = v_0$ as parent in $\leq_i$. Hence, $v'' \in \mathrm{cover}_{\leq_i}(v')$. Together with the
inductive hypothesis $v \leq_i v''$, this proves $v \leq_i v'$.

Similarly, $\leq_i \subseteq \leq_i'$. The other equality $\to_s = \to_s'$ can be shown in a similar
fashion using inductive arguments. □
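
To see the two mappings at work on a concrete case, the following Python sketch transcribes Defs. 4 and 5 and applies them to Fig. 1(b). It is an illustration under stated assumptions, not the construction used in the proofs: forests and trees are child-to-parent dictionaries, the inclusion order is represented by its covering relation, nodes are named by tactic identifier, and the function names mu12 and mu21 are hypothetical.

```python
# Fig. 1(b) as a type 1 hiproof (inclusion and sequence forests) ...
incl = {"T1": None, "DP": None, "T2": "T1", "WF": "T1",
        "Normalise": "DP", "Taut": "DP"}
seq = {"T1": None, "T2": None, "Normalise": None,
       "WF": "T2", "DP": "T2", "Taut": "Normalise"}
# ... and as a type 2 hiproof (tree plus inclusion levels), as in Example 3.
parent = {"T1": None, "T2": "T1", "WF": "T2", "DP": "T2",
          "Normalise": "DP", "Taut": "Normalise"}
level = {"T1": 0, "T2": 1, "WF": 1, "DP": 0, "Normalise": 1, "Taut": 1}


def mu12(incl, seq):
    """Type 1 -> type 2, following Def. 4."""
    def depth(v):
        return 0 if incl[v] is None else depth(incl[v]) + 1

    lvl = {v: depth(v) for v in incl}
    # v -> v' if v ->_s v', or if v immediately includes v' and v' has no
    # incoming arrow; the node with neither kind of predecessor is the root.
    par = {v: seq[v] if seq[v] is not None else incl[v] for v in incl}
    return par, lvl


def mu21(parent, level):
    """Type 2 -> type 1, following Def. 5."""
    # Sequential arrows are the tree edges that do not go one level deeper.
    seq = {v: (p if p is not None and level[v] <= level[p] else None)
           for v, p in parent.items()}

    def includer(v):
        # The node v0 with a path v0 -> v1 -> ... -> v that first descends
        # one level and then stays on that level, or None if there is none.
        p = parent[v]
        if p is None or level[v] < level[p]:
            return None
        return p if level[v] == level[p] + 1 else includer(p)

    incl = {v: includer(v) for v in parent}
    return incl, seq
```

On this example, mu12 applied to the type 1 encoding returns exactly the tree and levels of Example 3, and mu21 maps them back to the two forests, as Theorem 1 leads one to expect.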

We have now shown that the maps are mutually inverse. The next step is to 
introduce a notion of ‘underlying proof’ for hiproofs, and show that the maps
preserve this. 

We will define the skeleton of a hiproof to be the tree of its atomic tactics. 

If we think of atomic tactics as actually being inferences and axioms then this 
gives us a standard non-hierarchical proof. 


Definition 6. Let $h_1 = (V, \leq_i, \to_s, t)$ be a type 1 hiproof. We define the skeleton
of $h_1$ to be the A-labelled tree $(V_T, \to_T, r)$, corresponding (via Proposition 4) to
the finite poset $T = (V_T, \leq_T)$, where $V_T$ are the leaves of $\leq_i$, and $v_1 < v_2$ iff
there exists a $v \in V$ such that $v_2 \leq_i v$ and $v_1 \to_s v$. □

It is easily seen that Proposition 4 applies (since the poset has the non-sharing
condition) and we can convert it to a tree.

We denote the skeleton of a type 1 hiproof, $h_1$, by $\mathrm{sk}_1(h_1)$. For example, the
skeleton of the hierarchical proof at the end of Example 1 is the original basic 
proof. 

Definition 7. Let $h_2 = (V, \to, r, t, l)$ be a type 2 hiproof. Define an inclusion
node to be one which has a child with greater inclusion level. We define the
skeleton of $h_2$ to be the A-labelled tree $(V_T, \to_T, r)$ where $V_T$ are the non-inclusion
nodes, and $v \to_T v'$ iff $v \to v_1 \to \cdots \to v_n \to v'$, where $v_1, \ldots, v_n$ are
inclusion nodes, $r$ is the maximum non-inclusion node, and labelling is given by the
restriction of the labelling function to $V_T$. □

We denote the skeleton of a type 2 hiproof, $h_2$, by $\mathrm{sk}_2(h_2)$. This too is easily
seen to be a well-formed tree. And so, we can now formally justify that equivalent
type 1 and type 2 hiproofs denote the same underlying proof:

Theorem 2. Let $h_1$ and $h_2$ be hiproofs of type 1 and 2, respectively. Then,

1. $\mathrm{sk}_1(h_1) = \mathrm{sk}_2(\mu_{12}(h_1))$

2. $\mathrm{sk}_2(h_2) = \mathrm{sk}_1(\mu_{21}(h_2))$.

Proof. By well-founded induction on the hiproofs. At each stage we extend the
hiproof by one leaf, and show that the maps $\mathrm{sk}$ and $\mu$ respect this appropriately. □
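
The agreement of the two skeletons can be observed on Fig. 1(b). The Python sketch below is a hedged illustration: it assumes child-to-parent dictionaries as the encoding, and it computes each skeleton as a child-to-parent map, which for the type 1 case means taking the nearest predecessor in the order generated by the pairs of Def. 6. That reading of Def. 6, and the names skeleton1 and skeleton2, are assumptions made for the example.

```python
# Fig. 1(b) as a type 1 hiproof and, equivalently, as a type 2 hiproof.
incl = {"T1": None, "DP": None, "T2": "T1", "WF": "T1",
        "Normalise": "DP", "Taut": "DP"}
seq = {"T1": None, "T2": None, "Normalise": None,
       "WF": "T2", "DP": "T2", "Taut": "Normalise"}
parent = {"T1": None, "T2": "T1", "WF": "T2", "DP": "T2",
          "Normalise": "DP", "Taut": "Normalise"}
level = {"T1": 0, "T2": 1, "WF": 1, "DP": 0, "Normalise": 1, "Taut": 1}


def skeleton1(incl, seq):
    """Skeleton of a type 1 hiproof (Def. 6) as a child -> parent map."""
    leaves = [v for v in incl if v not in incl.values()]

    def predecessor(v):
        # Walk outwards to the innermost enclosing node (possibly v itself)
        # that has an incoming arrow; the source of that arrow precedes v.
        while v is not None and seq[v] is None:
            v = incl[v]
        return None if v is None else seq[v]

    return {v: predecessor(v) for v in leaves}


def skeleton2(parent, level):
    """Skeleton of a type 2 hiproof (Def. 7) as a child -> parent map."""
    # Inclusion nodes are those with a child at a greater inclusion level.
    inclusion = {p for v, p in parent.items()
                 if p is not None and level[v] > level[p]}

    def nearest(v):
        # Skip over inclusion nodes on the way up the tree.
        p = parent[v]
        while p is not None and p in inclusion:
            p = parent[p]
        return p

    return {v: nearest(v) for v in parent if v not in inclusion}
```

Both functions return the tree with root T2, children WF and Normalise, and Taut below Normalise: the composite tactics T1 and DP disappear, leaving only the primitive steps.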

4 Hiproof refinement 

We would like to introduce structure between hiproofs. The obvious notion of 
structure on a set of hiproofs is one of refinement. Intuitively, this corresponds 
to proof development. Hence $h_1$ refines to $h_2$ when $h_2$ extends the proof in $h_1$.

Proofs can grow in two ways: either by unfolding a tactic, or by applying a 
tactic to a subgoal. These correspond, respectively, to inclusion of and sequential 
composition with a tactic. Since we want to formalise a semantic, rather than
operational, notion of refinement, refinement amounts to allowing trees to grow
arbitrarily ‘at the bottom’ and, in the case of forests, adding additional trees. 
The definitions in this section make this precise. For each notion of hiproof 
introduced in the previous section, a relevant notion of refinement is presented, 
and we now ask that the maps preserve hiproof refinement. 

We begin by defining preliminary notions of refinement for trees and forests, 
the simple structures from which hiproofs are constructed. A rooted subtree of a 
given tree is essentially a non-empty subset of the tree’s nodes which is upwards 
closed wrt. $\to$. Formally:


Definition 8. A rooted subtree $T'$ of a given tree $T = (V, \to, r)$ is a tree $(V', \to', r)$
where

- $V'$ is a non-empty subset of $V$ which is upwards closed: whenever $v' \in V'$
then for all $v''$ such that $v'' \to v'$ one also has $v'' \in V'$; and thus also $r \in V'$

- $\to'$ is the restriction of $\to$ to $V' \times V'$. □

Henceforth, we refer to ‘rooted subtrees’ simply as ‘subtrees’. Intuitively,
refining a tree amounts to the addition, at possibly any level below the root, of 
any (finite) number of new nodes. Thus the original tree is always a subtree of 
any tree which refines it: 

Definition 9. Tree $(V_1, \to_1, r_1)$ refines to tree $(V_2, \to_2, r_2)$, written

$(V_1, \to_1, r_1) \sqsubseteq_T (V_2, \to_2, r_2)$,

iff the former is a subtree of the latter. □

Definition 10. Forest $F_1$ refines to forest $F_2$, written $F_1 \sqsubseteq_F F_2$, if there exists
an injective function $\mathcal{I} : F_1 \to F_2$ such that for all trees $T \in F_1$, $T \sqsubseteq_T \mathcal{I}(T)$. □

In practice, it is easier to use the following characterisations of forest refine- 
ment, which benefit from regarding a forest as a graph or poset: 

Proposition 2. Given forests $F_1 = (V_1, \to_1)$ and $F_2 = (V_2, \to_2)$, $F_1 \sqsubseteq_F F_2$ iff
$V_1 \subseteq V_2$ and for all $v, v' \in V_1$, $\mathit{isroot}_{F_1}(v) \Rightarrow \mathit{isroot}_{F_2}(v)$ and
$v \to_1 v' \Rightarrow v \to_2 v'$. □

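Proposition 3. Given forests $F_1 = (V_1, \leq_1)$ and $F_2 = (V_2, \leq_2)$, $F_1 \sqsubseteq_F F_2$ iff
for all $v, v' \in V_1$, $\mathit{isroot}_{F_1}(v) \Rightarrow \mathit{isroot}_{F_2}(v)$ and
$v \in \mathrm{cover}_{\leq_1}(v') \Rightarrow v \in \mathrm{cover}_{\leq_2}(v')$. □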
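
The graph characterisation is convenient to implement. As a minimal sketch, assuming forests are stored as child-to-parent dictionaries in which None marks a root, forest refinement reduces to checking that the larger forest agrees with the smaller one on every node of the smaller one. The name forest_refines and the example data are hypothetical.

```python
def forest_refines(f1, f2):
    """Does forest f1 refine to forest f2?

    Both forests are child -> parent maps with None marking a root. Agreement
    on every node of f1 says that the nodes of f1 are nodes of f2, that roots
    of f1 remain roots in f2, and that every edge of f1 is an edge of f2.
    """
    return all(v in f2 and f2[v] == p for v, p in f1.items())


# Sequential composition forest of Fig. 1(b), before and after DP is unfolded
# into Normalise followed by Taut.
seq_before = {"T1": None, "T2": None, "WF": "T2", "DP": "T2"}
seq_after = {"T1": None, "T2": None, "WF": "T2", "DP": "T2",
             "Normalise": None, "Taut": "Normalise"}
```

Here seq_before refines to seq_after but not conversely. A forest in which DP is a root does not refine to seq_after, since DP has an incoming arrow there.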

We are now in a position to formally present our notion of hiproof refinement. 

Definition 11. A hiproof $h = (V, \leq_i, \to_s, t)$ of type 1 refines to a hiproof
$h' = (V', \leq_i', \to_s', t')$ of the same type, written $h \sqsubseteq_1 h'$, iff
$(V, \leq_i) \sqsubseteq_F (V', \leq_i')$, $(V, \to_s) \sqsubseteq_F (V', \to_s')$ and, moreover, labels are
preserved: i.e. $t \subseteq t'$ (regarding here each of $t$ and $t'$ as a finite set of pairs,
e.g. $(v, t(v))$). □

Definition 12. A hiproof $h = (V, \to, r, t, l)$ of type 2 refines to a hiproof
$h' = (V', \to', r', t', l')$ of the same type, written $h \sqsubseteq_2 h'$, iff
$(V, \to, r) \sqsubseteq_T (V', \to', r')$ and, moreover, $t \subseteq t'$ and $l \subseteq l'$. □

Theorem 3. Let $h_1$ and $h_1'$ be hiproofs of type 1, and $h_2$ and $h_2'$ be hiproofs of
type 2. Then,

1. if $h_1 \sqsubseteq_1 h_1'$ then $\mu_{12}(h_1) \sqsubseteq_2 \mu_{12}(h_1')$; and

2. if $h_2 \sqsubseteq_2 h_2'$ then $\mu_{21}(h_2) \sqsubseteq_1 \mu_{21}(h_2')$. □


The proof is straightforward. The interesting point is that this is evidence that 
we have a natural definition of refinement. 
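
The first part of Theorem 3 can be observed on a small development step. The sketch below is illustrative and rests on several assumptions: hiproofs are encoded as pairs of dictionaries (child-to-parent maps, plus levels for type 2), labels coincide with node names so that label preservation is automatic, and the partial proof chosen is Fig. 1(b) before DP is unfolded. All function names are hypothetical.

```python
def mu12(incl, seq):
    """Type 1 -> type 2, following Def. 4 (tree as a child -> parent map)."""
    def depth(v):
        return 0 if incl[v] is None else depth(incl[v]) + 1

    par = {v: seq[v] if seq[v] is not None else incl[v] for v in incl}
    return par, {v: depth(v) for v in incl}


def agrees(small, big):
    """Every entry of `small` also occurs, unchanged, in `big`."""
    return all(v in big and big[v] == x for v, x in small.items())


def refines1(h, h_ext):
    """Def. 11: both the inclusion and the sequence forest are refined."""
    return agrees(h[0], h_ext[0]) and agrees(h[1], h_ext[1])


def refines2(h, h_ext):
    """Def. 12: the tree is a rooted subtree and levels are preserved."""
    return agrees(h[0], h_ext[0]) and agrees(h[1], h_ext[1])


# Fig. 1(b) before DP is unfolded ...
partial = ({"T1": None, "DP": None, "T2": "T1", "WF": "T1"},
           {"T1": None, "T2": None, "WF": "T2", "DP": "T2"})
# ... and the complete hiproof.
full = ({"T1": None, "DP": None, "T2": "T1", "WF": "T1",
         "Normalise": "DP", "Taut": "DP"},
        {"T1": None, "T2": None, "Normalise": None,
         "WF": "T2", "DP": "T2", "Taut": "Normalise"})
```

With this data the partial proof refines to the full one as a type 1 hiproof, and its image under mu12 refines to the image of the full proof as a type 2 hiproof. The two refinement checks have the same body because, in this encoding, a rooted subtree and preserved levels are both expressed as agreement of dictionaries.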



5 Conclusion 


This paper has made a first study of some structures used to represent hierarchy 
in formal proofs. We have described two different types of hiproof and shown 
their equivalence. There are other possible definitions of hiproof, not considered 
here, but a truly unifying treatment is likely to necessitate a more abstract ap- 
proach, such as a category-theoretic one. Hiproof refinement already introduces 
categorical structure. 

While the subtlety of our two definitions of hiproof clearly reflects the diffi- 
culty of capturing graphical intuitions mathematically, we do believe the under- 
lying concepts to be robust and insightful. The feedback which we have received 
so far seems to support this belief. Making what is graphically implicit mathe- 
matically explicit is a necessary first step for reasoning about these diagrams. 

We readily acknowledge that, in practice, tactics possess a structure far more 
complex than we have studied here. Although it may seem that we have ab- 
stracted too far we believe, nevertheless, that our definitions are at an appro- 
priate level. Firstly, the particular axioms on hiproofs are motivated specifically 
by considering proofs and tactics. Secondly, our aim here has been to study hi- 
erarchy and it makes sense to first study this in a minimal setting. Thirdly, a 
simple graphical interface would, we believe, represent tactics at this level (for 
example, [RSG98]). 

The main result of the paper is not the equivalence of two concrete definitions 
of hiproof, but the fact that we have found a suitable level of abstraction at which 
such equivalences exist. We are currently developing a more axiomatic definition, 
which captures the essential features of these, and other, concrete instances. We 
see the equivalence of two independent notions of hierarchy as evidence that we 
have chosen the correct level of abstraction. 

Another important task is to define the operations on hiproofs, supported in
the various proof assistants, such as various abstraction operations. Such ‘zoom- 
ing’ operations have been considered for higraphs and statecharts [Har88,APT02], 
and there are natural operations to consider here. 

We are not claiming that our theory will lead to new, or better written tactics 
in existing systems. The advantage, though, of deciding on the general proper- 
ties and operations of hiproofs would be to give a principled way of designing 
an interface which does not depend on the specifics of hiproof representation. 
Moreover, such a theory would allow us to reason about the correctness of an 
implementation. 

We also plan to characterise the relationship between our semantic structures 
and the underlying logic, introducing a notion of stepwise refinement. Finally, 
as we have acknowledged in this paper, our theory only attempts to describe a 
simple notion of tactic and proof. Modern theorem provers make use of many rich 
structures. This work is just a first step towards providing a semantic foundation 
for this area. 



References 


[APT02] Stuart Anderson, John Power, and Konstantinos Tourlas. Zooming out
of higraph-based diagrams: syntactic and semantic issues. In Proceedings
of CATS 2002, the Australasian Symposium on Theory of Computing, volume 61
of Electronic Notes in Theoretical Computer Science (ENTCS). Elsevier, 2002.



Fiedler, Xiaorong Huang, Manfred Kerber, Michael Kohlhase, Karsten Konrad,
Andreas Meier, Erica Melis, Wolf Schaarschmidt, Jörg Siekmann, and
Volker Sorge. Omega: Towards a mathematical assistant. In Proceedings of
CADE-14, volume 1249 of LNAI. Springer, 1997.

[BH96] Jon Barwise and Eric Hammer. Diagrams and the concept of logical system.
In G. Allwein and J. Barwise, editors, Logical Reasoning with Diagrams,
pages 49-78. Oxford University Press, 1996.

[Bun96] A. Bundy. Proof planning. In B. Drabble, editor, Proceedings of the 3rd
International Conference on AI Planning Systems (AIPS) 1996, pages 261-267,
1996. Also available as DAI Research Report 886.

[CS00] L. Cheikhrouhou and V. Sorge. PDS — A Three-Dimensional Data Structure 
for Proof Plans, 2000. 

[Har88] David Harel. On visual formalisms. Communications of the ACM, 31(5):514-530,
1988.

[KNM94] D. Kapur, X. Nie, and D. R. Musser. An overview of the Tecton proof
system. Theoretical Computer Science, 133(2):307-340, 1994.

[RB99] Julian Richardson and Alan Bundy. Proof Planning Methods as Schemas.
DAI Technical Report, Division of Informatics, University of Edinburgh,
1999. Also available at
http://www.dai.ed.ac.uk/~julianr/proof-planning.ps.gz.

[RS01] J. D. C. Richardson and A. Smaill. Continuations of proof strategies. In
International Joint Conference on Automated Reasoning, IJCAR-2001,
Short Papers, June 2001. Technical Report DII 11/01, Dipartimento di
Ingegneria dell'Informazione, Università di Siena, Italy.

[RSG98] J. D. C. Richardson, A. Smaill, and I. Green. System description: proof
planning in higher-order logic with Lambda-Clam. In 15th International
Conference on Automated Deduction, pages 129-133, 1998.

Appendix: Graphs, trees and forests 


In this section we present the basic order theory together with some novel defi- 
nitions needed for our definitions of hiproof. 

Definition 13. A partially ordered set (or poset for short) $(X, \le)$ is a set $X$
together with a reflexive, antisymmetric and transitive relation $\le$ on $X$. For each
$x \in X$ the set $\mathit{cover}(x)$ is defined to be

$$\{\, x' \in X \mid x' < x \text{ and for all } x'' \in X \text{ such that } x'' < x \text{ one has } x'' \le x' \,\}$$

and will be referred to as the cover of $x$.


□ 



As usual, we write $x < y$ to mean $x \le y$ but not $x = y$. Thus, an element
$y$ is in the cover of $x$ iff $y$ is strictly less than $x$ and also $y' \le y$ for all
other $y'$ such that $y' < x$.

The graphs we consider here are directed, consisting as usual of a pair $(V, E)$,
where $V$ is a set (the vertices), and $E \subseteq V \times V$ a binary relation on $V$
(the edges). Thus, writing $E^*$ for the reflexive and transitive closure of the
relation $E$, a path from $v$ to $v'$ exists if and only if $v \mathrel{E^*} v'$. An
important restriction on the graphs we consider is that they contain no cycles
(i.e. non-empty paths from any vertex to itself).
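The two graph notions just introduced can be sketched in a few lines of Python. This is only an illustration: the function names and the representation of $E$ as a set of pairs are assumptions of the sketch, not part of the paper.

```python
from collections import defaultdict


def reachable(edges, start):
    """Return all w with start E* w (reflexive-transitive closure of E)."""
    succ = defaultdict(set)
    for u, w in edges:
        succ[u].add(w)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in succ[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def has_path(edges, v, w):
    """A (possibly empty) path from v to w exists iff v E* w."""
    return w in reachable(edges, v)


def is_acyclic(vertices, edges):
    """No non-empty path leads from any vertex back to itself."""
    succ = defaultdict(set)
    for u, w in edges:
        succ[u].add(w)
    # A cycle through v exists iff some direct successor of v reaches v.
    return not any(v in reachable(edges, w)
                   for v in vertices for w in succ[v])
```

Note that `has_path(edges, v, v)` always holds, since the empty path is allowed, whereas a cycle requires a non-empty path.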

Definition 14. A tree $T = (V, \to, r)$ is a finite dag $(V, \to)$ together with a
distinguished vertex $r \in V$, called the root, such that there is exactly one path
from $r$ to every other vertex $v \neq r$. For every edge $(v, v') \in {\to}$, which we shall
conventionally write as $v \to v'$, one says that $v'$ is a child of $v$ or, equivalently,
that $v'$ has parent $v$. The vertices $V$ in a tree are conventionally also called
nodes. □

Given any two nodes $v$ and $v'$ in a tree $T = (V, \to, r)$ we say that $v'$ is a
descendent of $v$, and write $v' \le v$, to mean that there is a path in the tree
from $v$ to $v'$. Thus, formally, $v' \le v$ holds iff $v \to^* v'$. As $\le$ turns out to be a
partial order, it is often most convenient to interchange the graph-theoretic and
order-theoretic views of trees:

Proposition 4. To give a tree (in the sense of Def. 14) is equivalent to giving
a finite poset $T = (V, \le)$ which satisfies the ‘non-sharing’ condition

$$\forall x, y, z \in V.\ x \le y \text{ and } x \le z \text{ implies } y \le z \text{ or } z \le y,$$

and has a top element $\top$. The resulting tree is $(V, \to, \top)$, where $v \to v'$ whenever
$v' \in \mathit{cover}_{\le}(v)$.

□ 
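The passage between the two views of a tree can be illustrated with a small Python sketch. All names are hypothetical. The sketch also assumes that the elements of the cover of $v$ are read as the immediate predecessors of $v$ (elements strictly below $v$ with nothing strictly in between), which is the reading under which the recovered edges are exactly the parent-child edges.

```python
def descendant_order(vertices, edges):
    """Pairs (x, y) with x <= y, i.e. there is a path y ->* x in the tree."""
    leq = {(v, v) for v in vertices} | {(c, p) for p, c in edges}
    changed = True
    while changed:  # naive transitive closure, fine for small examples
        changed = False
        for x, y in list(leq):
            for y2, z in list(leq):
                if y == y2 and (x, z) not in leq:
                    leq.add((x, z))
                    changed = True
    return leq


def non_sharing(vertices, leq):
    """x <= y and x <= z implies y <= z or z <= y."""
    return all((y, z) in leq or (z, y) in leq
               for x in vertices for y in vertices for z in vertices
               if (x, y) in leq and (x, z) in leq)


def top(vertices, leq):
    """The element above all others, or None if there is none."""
    tops = [t for t in vertices if all((v, t) in leq for v in vertices)]
    return tops[0] if tops else None


def immediate_below(vertices, leq, v):
    """Assumed reading of the cover of v: maximal elements strictly below v."""
    below = {x for x in vertices if (x, v) in leq and x != v}
    return {x for x in below
            if not any((x, y) in leq and x != y for y in below)}


def tree_edges(vertices, leq):
    """Recover the edge relation v -> v' from the order."""
    return {(v, c) for v in vertices
            for c in immediate_below(vertices, leq, v)}
```

Starting from the edges of a small tree, `descendant_order` followed by `tree_edges` returns the original edge set, while a poset in which one element lies below two incomparable elements fails `non_sharing`.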

Definition 15. A forest $F$ is a finite set $\{T_1, \ldots, T_n\}$ of trees $T_j = (V_j, \to_j, r_j)$.
We shall write $v \to_F v'$ (or just $v \to v'$ when $F$ is understood) to mean that there
exists a tree $T_j$ in $F$ such that $v, v' \in V_j$ and $v \to_j v'$. Consequently we shall often
also write the forest $F$ as $(V, \to_F)$, where $V$ is the disjoint union of all $V_j$. □

Corollary 1. To give a forest in the sense of Def. 15 is equivalent to giving
a finite poset $(V, \le)$ subject only to the ‘non-sharing’ condition of Prop. 4:

$$\forall x, y, z \in V.\ x \le y \text{ and } x \le z \text{ implies } y \le z \text{ or } z \le y.$$ □
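As a rough illustration of the corollary, a forest can be stored as a parent map, from which the order, the roots and the individual trees are all derived. The representation and every name below are assumptions of this sketch rather than anything fixed by the paper.

```python
def ancestors(parent, v):
    """The chain v, parent(v), ... ending at the root of v's tree."""
    chain = [v]
    while parent.get(chain[-1]) is not None:
        chain.append(parent[chain[-1]])
    return chain


def leq(parent, x, y):
    """x <= y iff y lies on the chain from x up to its root."""
    return y in ancestors(parent, x)


def non_sharing(parent):
    """Holds because everything above a node lies on a single chain."""
    nodes = list(parent)
    return all(leq(parent, y, z) or leq(parent, z, y)
               for x in nodes for y in nodes for z in nodes
               if leq(parent, x, y) and leq(parent, x, z))


def roots(parent):
    """Maximal elements of the poset, one per tree of the forest."""
    return {v for v, p in parent.items() if p is None}


def trees(parent):
    """Group the nodes by the root of the tree they belong to."""
    groups = {}
    for v in parent:
        groups.setdefault(ancestors(parent, v)[-1], set()).add(v)
    return groups


def has_top(parent):
    """A forest with at least two trees has no top element."""
    return any(all(leq(parent, v, t) for v in parent) for t in parent)
```

The only difference from the tree case of Prop. 4 is that `has_top` may fail: dropping the top-element requirement is what allows several roots.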

Appendix: Technical proofs 

Proof of Lemma 1 

Proof. Using contraposition and the shunting equivalence
$(p \land q \Rightarrow r) \Leftrightarrow (p \Rightarrow (q \Rightarrow r))$, the third condition in Def. 2 may
equivalently be rewritten thus:
$\forall v, v'.\ \mathit{siblings}_{\le_i}(v, v') \land v \neq v' \land \mathit{isroot}_{\to_s}(v) \Rightarrow \neg\mathit{isroot}_{\to_s}(v')$.
Taking $v$ to be $v_0$ with $\mathit{isroot}_{\to_s}(v_0)$, i.e. a $v_0$ with no incoming $\to_s$ edges, it
follows that every sibling $v'$ of $v_0$ which is not $v_0$ itself cannot be root wrt. $\to_s$,
i.e. $v'$ must have an incoming edge. Assuming also $v_0$ to be $\le_i$-maximal, its siblings
(excluding itself) are precisely all the other $\le_i$-maximal nodes, and the result follows.

Defining $v \lhd v'$ as $v \in \mathit{cover}_{\le_i}(v')$ and $\lhd^n$ as the $n$-fold composite $(\lhd)^n$,
one proves $v \to_s w_1 \land w_1 \lhd^n w_2 \Rightarrow v \lhd^n w_2$, from which the claimed
equivalence is easily obtained. □

Proposition 5. $\mu_{12}$ above is indeed well-defined, i.e. each value of $\mu_{12}$
conforms to Def. 3.

Proof. Firstly, by recalling Prop. 1 and observing that $\to^+ \subseteq (>_i \cup \to_s)^+$, it
follows that $(V, \to)$ is an acyclic graph. Moreover, whenever $v_1 \to v$ and $v_2 \to v$
one has $v_1 = v_2$ (for the only possible cases are $v_1 \to_s v$ and $v_2 \to_s v$, or,
$v \in \mathit{cover}_{\le_i}(v_1)$ and $v \in \mathit{cover}_{\le_i}(v_2)$; $v_1 = v_2$ immediately follows from $(V, \to_s)$
and $(V, \le_i)$ being forests). Thus, whenever a path $v_0 \to^* v$ exists it must be
unique. We show that indeed $r \to^* v$ for all $v \in V$ by induction on $d(v)$, the
‘depth’ of $v$ wrt. $\to_s$, which is defined thus: $d(v) = d(v') + 1$ whenever $v' \to v$ and
$d(v) = 0$ otherwise. When $d(v) = 0$ then clearly $\mathit{isroot}_{\to_s}(v)$ and $\mathit{siblings}_{\le_i}(v, r)$
(in the sense that $\mathit{isroot}_{\le_i}(v)$). Now $r = v$ by Lemma 1, and $r \to^* v$ holds
trivially. In the inductive case assume true for $v'$ and $v' \to v$. Then the induction
hypothesis yields $r \to^* v'$; and so, transitively, also $r \to^* v$.

Showing that $l(v') \le l(v) + 1$ whenever $v \to v'$ proceeds by case analysis.
Case $v' \in \mathit{cover}_{\le_i}(v)$ and $\mathit{isroot}_{\to_s}(v')$ is immediate. When $v \to_s v'$ one
examines whether $\mathit{isroot}_{\le_i}(v')$ or not. When so, $l(v') = 0 < l(v) + 1$. When
$v' \in \mathit{cover}_{\le_i}(v'')$ for some $v''$, condition 2 yields (via Lemma 1) $v \in \mathit{cover}_{\le_i}(v'')$,
from which $l(v) = l(v'') + 1 = l(v')$ follows.

Assume $v \to v_1$ and $v \to v_2$ such that $l(v_1) = l(v) + 1$. Then one must have
$v = \mathit{parent}_{\le_i}(v_1)$ and hence $v_1 <_i v$ in the type 1 hiproof. Further, we distinguish
two cases: firstly, in the case $l(v_2) = l(v_1)$ one has $\mathit{siblings}_{\le_i}(v_1, v_2)$ while also
$\mathit{isroot}_{\to_s}(v_1) \land \mathit{isroot}_{\to_s}(v_2)$. Then condition 4 of Def. 2 establishes $v_1 = v_2$, as
required. Similarly the case $l(v_2) \le l(v)$ means $v \to_s v_2$ while $v_1 <_i v$, hence
$v_1 = v_2$ by the third condition in the definition of type 1 hiproofs. Finally, that
$l(r) = 0$ is obvious. □

Lemma 2. In the context of Def. 5, $\mathit{isroot}_{\to_s}(v)$ is equivalent to
$\forall v_0 \in V.\ (v_0 \to v \Rightarrow l(v_0) + 1 \le l(v))$.

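Proof. Using the definition of $\to_s$ and the tautology $(p \Rightarrow q) \Leftrightarrow (\neg p \lor q)$:

$$\begin{aligned}
\mathit{isroot}_{\to_s}(v) &\Leftrightarrow \neg\exists v_0.\ (v_0 \to v \land l(v) \le l(v_0)) \\
&\Leftrightarrow \forall v_0.\ (v_0 \not\to v \lor l(v) > l(v_0)) \\
&\Leftrightarrow \forall v_0.\ (v_0 \not\to v \lor l(v) \ge l(v_0) + 1) \\
&\Leftrightarrow \forall v_0.\ (v_0 \to v \Rightarrow l(v_0) + 1 \le l(v))
\end{aligned}$$

□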
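Lemma 2 can be checked by brute force on a small example. The sketch below assumes, as the first step of the derivation suggests, that $v_0 \to_s v$ holds exactly when $v_0 \to v$ and $l(v) \le l(v_0)$; the tree, the level assignment and the function names are hypothetical and chosen only for illustration.

```python
def isroot_s(edges, level, v):
    """v has no incoming ->_s edge, where (assumed reading of Def. 5)
    u ->_s v holds iff u -> v and level[v] <= level[u]."""
    return not any(w == v and level[v] <= level[u] for u, w in edges)


def lemma2_rhs(edges, level, v):
    """Every tree edge u -> v goes at least one level up."""
    return all(level[u] + 1 <= level[v] for u, w in edges if w == v)


# Hypothetical levelled tree: r -> a, a -> b, a -> c.
edges = {("r", "a"), ("a", "b"), ("a", "c")}
level = {"r": 0, "a": 1, "b": 1, "c": 2}

# Nodes on which the two sides of the lemma agree.
agree = {v for v in level
         if isroot_s(edges, level, v) == lemma2_rhs(edges, level, v)}
```

With integer levels the two sides coincide on every node, because $l(v) > l(v_0)$ and $l(v) \ge l(v_0) + 1$ are the same condition; in this example only `b` fails to be a root wrt. $\to_s$.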

Proposition 6. $\mu_{21}$ above is well-defined, i.e. for each hiproof of type 2, with
underlying tree $(V, \to, r)$ and level function $l$, its image under $\mu_{21}$ is a hiproof
of type 1.


Proof. (Sketch) $<_i$ is manifestly irreflexive and transitive. On the other hand,
$<_i$ is clearly antisymmetric, as follows from observing that $<_i \subseteq (\to^*)^{-1}$ while
$(V, \to, r)$ is a tree. Thus, the definition of $\le_i$ as the reflexive closure of $<_i$ makes
$(V, \le_i)$ a poset. The non-sharing condition, needed by Corol. 1 to show $(V, \le_i)$
a forest, as required, also follows from $(V, \to, r)$ being a tree and $<_i \subseteq (\to^*)^{-1}$:
to assume $v <_i w_1$ and $v <_i w_2$ while $w_1 \neq w_2$ would mean the existence of two
distinct paths from $r$ to $v$, thus contradicting the fact of $(V, \to, r)$ being a tree.
One must therefore admit that, whenever $v <_i w_1$ and $v <_i w_2$, $w_1$ must equal $w_2$.

To show $(V, \to_s)$ a forest, as required, we shall show that $(V, (\to_s^*)^{-1})$ is a
forest-qua-poset and appeal to Corol. 1. The poset structure of $(V, (\to_s^*)^{-1})$ is
immediate as $\to_s^*$ is clearly antisymmetric. Again, observing that $\to_s \subseteq \to$, the
‘non-sharing’ condition required by Corol. 1 follows, as above, from $(V, \to, r)$
being a tree.

To show that $\le_i$ and $\to_s$ are mutually exclusive in the sense of condition (1)
of Def. 2, consider first the case of $v \le_i w$ and $v \to_s^* w$: as the former implies
$w \to^* v$ and the latter implies $v \to^* w$, $v = w$ follows easily from the acyclicity
of the tree $(V, \to, r)$. In the case of $v \le_i w$ (hence also $l(w) \le l(v)$) while $w \to_s^* v$
(and hence $l(v) \le l(w)$), one must have $l(v) = l(w)$ and so, according to the
definition of $\le_i$, $v = w$.

To establish condition 2 we appeal to Lemma 1. First, observe that
$w_1 \in \mathit{cover}_{\le_i}(w_2)$ means the existence of a non-empty path
$w_2 \to v_1 \to \ldots \to v_n \to w_1$ in the tree $(V, \to, r)$ such that $l(v_1) = l(w_2) + 1$ and
$l(v_1) = \ldots = l(v_n) = l(w_1)$. Assuming also that $v \to_s w_1$, i.e. also $v \to w_1$, forces
$v = v_n$, for $(V, \to, r)$ is a tree. That $v \in \mathit{cover}_{\le_i}(w_2)$ now follows immediately
from Def. 5.

For condition 3, suppose that $w_1 \le_i v$ and $v \to_s w_2$. We must show that
$v = w_1$. By the definition of $\mu_{21}$, we have that $w_1 \mathrel{(\lhd)^*} v$ and $v \to w_2$, with
$l(w_2) \le l(v)$. Suppose that the path from $w_1$ to $v$ is non-empty, i.e.
$w_1 \mathrel{(\lhd)^*} w'_0 \lhd v$ for some $w'_0$. Then by the definition of $\lhd$ there exists a $w_0$ such
that $v \to w_0$ and $l(w_0) = l(v) + 1$. Now, by condition 3 of Def. 3, we have that
$w_2 = w_0$, but this is impossible because they have different levels. Therefore the
path from $w_1$ to $v$ must be empty, and so $w_1 = v$.

For showing condition 4 assume first that $\mathit{siblings}_{\le_i}(v, v')$. Then either
$v = v'$, or else (by unfolding the definition of $\mathit{cover}_{\le_i}$) there must exist $v_0$ such
that, at least, $v_0 \to v$ and $v_0 \to v'$. Thus, by condition 2 of Def. 3, $l(v) \le l(v_0) + 1$
and $l(v') \le l(v_0) + 1$ also hold. Assuming further that $\mathit{isroot}_{\to_s}(v) \land \mathit{isroot}_{\to_s}(v')$,
$v_0 \to v$ and $v_0 \to v'$ additionally yield, by Lemma 2, that $l(v_0) + 1 \le l(v)$ and
$l(v_0) + 1 \le l(v')$. Hence, $l(v) = l(v') = l(v_0) + 1$ and, by condition (3) of Def. 3,
it now follows that $v = v'$, as required. □



